This paper concerns the estimation of sums of functions of observable
and unobservable variables. Lower bounds for the asymptotic variance
and a convolution theorem are derived in general finite- and
infinite-dimensional models. An explicit relationship is established
between efficient influence functions for the estimation of sums of
variables and the estimation of their means. Certain "plug-in"
estimators are proved to be asymptotically efficient in finite-dimensional
models, while "u,v" estimators of Robbins are proved to be efficient
in infinite-dimensional mixture models. Examples include certain
species, network and data confidentiality problems.

1. Introduction. Given a pool of n motorists, how do we estimate the 
total intensity of those in the pool who have a prespecified number of traffic 
accidents in a given time period? This is an example of a broad class of 
problems involving the estimation of sums of random variables 

(1.1)    $S_n = \sum_{j=1}^{n} u(X_j, \theta_j)$

[24], where $X_j$ are observable variables, $\theta_j$ are unobservable variables or
constants, and $u(\cdot,\cdot)$ is a certain utility function. The estimation of (1.1)
has numerous important applications. In the motorist example, $X_j$ is the
number of traffic accidents and $\theta_j$ the intensity of the $j$th individual in
the pool, and $u(x,\vartheta) = \vartheta I\{x = a\}$ for a prespecified integer $a$. In Sections
3, 4 and 5 we consider applications in certain species, network and data
confidentiality problems.
The estimation of (1.1) is a nonstandard problem in statistics, since the
sums, involving observables as well as unobservables, are not parameters.
Without a theory of efficient estimation, the performance of different
estimators can only be measured against each other in terms of relative efficiency.
For the specific motorist example with $u(x,\vartheta) = \vartheta I\{x = a\}$, Robbins and
Zhang [28] proved that, in a Poisson mixture model, the efficient estimation
of (1.1) is equivalent to the efficient estimation of $E(\theta \mid X = a)$, so that the
usual information bounds can be used. In this paper we provide a general
theory for the efficient estimation of sums of variables.

Let $(X,\theta)$, $(X_j,\theta_j)$, $j = 1,\dots,n$, be i.i.d. vectors with an unknown
common joint distribution $F$. Our general theory covers asymptotic efficiency
for the estimation of

(1.2)    $S_n = S_n(F) = \sum_{j=1}^{n} u(X_j,\theta_j;F)$

based on $X_1,\dots,X_n$, where the utility $u(x,\vartheta;F)$ is also allowed to depend
on $F$. This provides a unified asymptotic theory for the estimation of (1.1)
and conventional parameters $u(F)$, since the utility is allowed to depend on
$F$ only. Our problem is closely related to the estimation of the mean

(1.3)    $\mu(F) = E_F u(X,\theta;F)$.

If $E_F u^2(X,\theta;F) < \infty$ and $1/2 < \alpha < 1$, an estimator is $n^{\alpha}$-consistent for the
estimation of $S_n(F)$ iff it is $n^{\alpha}$-consistent for the estimation of its mean
$n\mu(F) = E_F S_n(F)$. But an efficient estimator of $n\mu(F)$ is not necessarily
an efficient estimator of $S_n(F)$, since the two estimation problems may have
different efficient influence functions, as we demonstrate below in (1.4)-(1.6)
and in simple examples in Sections 2.3 and 2.4. The asymptotic theory for
the estimation of $\mu(F)$ is well understood; see [3, 17, 31].

Suppose that $F$ belongs to a known class $\mathcal{F}$. Let $F_0 \in \mathcal{F}$. An estimator
$\hat\mu_n$ of (1.3) is (locally) asymptotically efficient in contiguous neighborhoods
of $P_{F_0}$ iff

(1.4)    $\hat\mu_n = \mu(F_0) + \frac{1}{n}\sum_{j=1}^{n} \psi_*(X_j) + o_{P_{F_0}}(n^{-1/2})$,

where $\psi_*(x) = \psi_*(x;F_0)$ is the efficient influence function at $F_0$ for the
estimation of $\mu(F)$. In Section 6 we show that, under mild regularity conditions
on the utility functions $\{u(x,\vartheta;F), F \in \mathcal{F}\}$, an estimator $\hat S_n$ of (1.2) is
(locally) asymptotically efficient in contiguous neighborhoods of $P_{F_0}$ iff

(1.5)    $\frac{\hat S_n}{n} = \mu(F_0) + \frac{1}{n}\sum_{j=1}^{n} \phi_*(X_j) + o_{P_{F_0}}(n^{-1/2})$,
where $\phi_* = \phi_*(x;F_0)$ is the efficient influence function at $F_0$ for the
estimation of $S_n(F)$. Furthermore, the following relationship holds between the
two efficient influence functions in (1.4) and (1.5):

(1.6)    $\phi_*(x) = \psi_*(x) + \bar u(x;F_0) - \mu(F_0) - \bar u_*(x)$,

where $\bar u(x;F) = E_F[u(X,\theta;F) \mid X = x]$ and $\bar u_*(x) = \bar u_*(x;F_0)$ is the
projection of $\bar u(x;F_0)$ to the tangent space of the family of distributions
$\{F^X, F \in \mathcal{F}\}$ at $F_0^X$. Here $F^X$ is the marginal distribution of $X$ under the
joint distribution $F$ of $(X,\theta)$. It follows clearly from (1.6) that asymptotically
efficient estimations of $S_n(F)/n$ and $\mu(F)$ are equivalent in contiguous
neighborhoods of $P_{F_0}$ iff $\bar u(\cdot;F_0) - \mu(F_0)$ is in the tangent space, that is,
$\bar u(\cdot;F_0) - \mu(F_0) = \bar u_*(\cdot;F_0)$.

We will derive more explicit results in finite-dimensional models and
infinite-dimensional mixture models. In finite-dimensional models
$\mathcal{F} = \{F_\tau, \tau \in \mathcal{T}\}$ with a Euclidean $\tau$, it will be shown that "plug-in"
estimators of the form $\sum_{j=1}^{n} \bar u(X_j; F_{\hat\tau_n})$ are asymptotically efficient for
the estimation of (1.2) if $\hat\tau_n$ is an efficient estimator of $\tau$. In
infinite-dimensional mixture models, certain "u,v" estimators of Robbins [24]
will be shown to be efficient for the estimation of (1.1). We shall consider
estimation of (1.1) with known $f(x \mid \vartheta)$ in Section 2 and provide the general
theory in Section 6. Section 7 contains proofs of all theorems.

2. Mixture models. Suppose $(X,\theta) \sim F(dx, d\vartheta) = f(x \mid \vartheta)\,\nu(dx)\,G(d\vartheta)$, that
is,

(2.1)    $X \mid \theta \sim f(x \mid \theta)$,    $\theta \sim G$.

In this section we state our results for the estimation of (1.1) with known
$f(\cdot \mid \cdot)$.

2.1. Finite-dimensional mixture models. Let $\{G_\tau, \tau \in \mathcal{T}\}$ be a parametric
family of distributions with an open $\mathcal{T}$ in a Euclidean space. Suppose
(2.1) holds with $G = G_\tau$ for an unknown vector $\tau \in \mathcal{T}$. Suppose that, for
certain functions $\rho_\tau$,

(2.2)    $\int (\sqrt{g_{\tau,\Delta}} - 1 - \Delta'\rho_\tau/2)^2 \, dG_\tau = o(\|\Delta\|^2)$,
         $\int g_{\tau,\Delta} \, dG_\tau = 1 + o(\|\Delta\|^2)$,    as $\Delta \to 0$,

where $g_{\tau,\Delta}$ is the Radon-Nikodym derivative of the absolutely continuous
part of $G_{\tau+\Delta}$ with respect to $G_\tau$. Let $E_\tau$ denote the expectation under $G_\tau$.
The Fisher information matrix for the estimation of $\tau$ based on a single $X$
is

(2.3)    $I_\tau = \mathrm{Cov}_\tau(\bar\rho_\tau(X))$,    $\bar\rho_\tau(x) = E_\tau[\rho_\tau(\theta) \mid X = x]$.

Define $\bar u_\tau(x) = E_\tau[u(X,\theta) \mid X = x]$ and $\mu_\tau = E_\tau u(X,\theta)$.
Theorem 2.1. Suppose (2.2) holds, $E_\tau u^2(X,\theta)$ is locally bounded and
$I_\tau$ are of full rank for all $\tau \in \mathcal{T}$. Then $\{\hat S_n, n \ge 1\}$ is an asymptotically
efficient estimator of (1.1) iff (1.5) holds with $\mu(F_0) = \mu_\tau$, $P_{F_0} = P_\tau$, and the
efficient influence function

(2.4)    $\phi_* = \phi_{*,\tau} = \bar u_\tau - \mu_\tau + \bar\rho_\tau' I_\tau^{-1} \gamma_\tau$,

where $\gamma_\tau = E_\tau \mathrm{Cov}_\tau(u(X,\theta), \rho_\tau(\theta) \mid X) = E_\tau\{u(X,\theta)\rho_\tau(\theta) - \bar u_\tau(X)\bar\rho_\tau(X)\}$.

Remark 2.1. Since $\kappa_{*,\tau} = I_\tau^{-1}\bar\rho_\tau$ is the efficient influence function for
the estimation of $\tau$ and $\partial\mu_\tau/\partial\tau = E_\tau u(X,\theta)\rho_\tau(\theta)$,
$\psi_{*,\tau} = \bar\rho_\tau' I_\tau^{-1} E_\tau u(X,\theta)\rho_\tau(\theta)$
is the efficient influence function for the estimation of $\mu_\tau$. Moreover,
$\bar u_{*,\tau} = \bar\rho_\tau' I_\tau^{-1} E_\tau \bar u_\tau(X)\bar\rho_\tau(X)$ is the projection of $\bar u_\tau$ to the tangent
space generated by the scores $\bar\rho_\tau(X)$ under $P_\tau$. Thus, Theorem 2.1 asserts
that (1.5) and (1.6) hold under (2.2).

Our next theorem provides the asymptotic theory for plug-in estimators

(2.5)    $\hat S_n = \sum_{j=1}^{n} \bar u_{\hat\tau_n}(X_j)$

of (1.1), where $\bar u_\tau(x) = E_\tau[u(X,\theta) \mid X = x]$ as in Theorem 2.1. An estimator
$\hat\tau_n$ of the vector $\tau$ is an asymptotically linear one with influence functions
$\kappa_\tau$ under $P_\tau$ if

(2.6)    $\hat\tau_n = \tau + \frac{1}{n}\sum_{j=1}^{n} \kappa_\tau(X_j) + o_{P_\tau}(n^{-1/2})$,

with $E_\tau \kappa_\tau(X)\bar\rho_\tau'(X)$ being the identity matrix.

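Theorem 2.2. Let $\hat S_n$ be as in (2.5) with an asymptotically linear
estimator $\hat\tau_n$ as in (2.6). Suppose conditions of Theorem 2.1 hold,
$E_\tau \bar u^2_{\tau+\Delta}(X) = O(1)$ as $\Delta \to 0$ for every $\tau \in \mathcal{T}$, and for all $\tau \in \mathcal{T}$ and $c > 0$,

(2.7)    $\sup_{\|\Delta\| \le c/\sqrt{n}} \left| \sum_{j=1}^{n} [\bar u_{\tau+\Delta}(X_j) - \bar u_\tau(X_j) - \{E_\tau \bar u_{\tau+\Delta}(X) - \mu_\tau\}] \right| = o_{P_\tau}(n^{1/2})$.

Let $\phi_{*,\tau}$ and $\gamma_\tau$ be as in Theorem 2.1 and $\kappa_{*,\tau} = I_\tau^{-1}\bar\rho_\tau$. Then

(2.8)    $n^{-1/2}(\hat S_n - S_n) \xrightarrow{D} N(0, \sigma_\tau^2)$,    $\sigma_\tau^2 = \sigma_{*,\tau}^2 + \mathrm{Var}_\tau(\{\kappa_\tau(X) - \kappa_{*,\tau}(X)\}'\gamma_\tau)$

under $P_\tau$, where $\sigma_{*,\tau}^2 = \mathrm{Var}_\tau(\phi_{*,\tau}(X) - u(X,\theta))$. Consequently, $\hat S_n$ is an
asymptotically efficient estimator of (1.1) at $P_{\tau_0}$ iff $\gamma_{\tau_0}'\hat\tau_n$ is an
asymptotically efficient estimator of $\gamma_{\tau_0}'\tau$ in contiguous neighborhoods of $P_{\tau_0}$.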
Remark 2.2. It follows from (2.8) that $|\hat S_n - S_n| \le 1.96\,\sigma_{\hat\tau_n} n^{1/2}$
provides an approximate 95% confidence interval for (1.1), provided that $\sigma_\tau$ is
continuous in $\tau$.

Remark 2.3. Condition (2.7) holds if $\{\bar u_{\tau+\Delta} : \tau + \Delta \in \mathcal{T}, \|\Delta\| \le \delta_\tau\}$ is
a Donsker class under $P_\tau$ for some $\delta_\tau > 0$ and $E_\tau \bar u^2_{\tau+\Delta}(X)$ is continuous at
$\Delta = 0$.

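2.2. General mixtures. Let $\mathcal{G}$ be a convex class of distributions. Suppose
(2.1) holds with an unknown $G \in \mathcal{G}$. Let $E_G$ be the expectation under (2.1).
Suppose $E_G u^2(X,\theta) < \infty$ for all $G \in \mathcal{G}$. Define

(2.9)    $\mathcal{G}_{G_0} = \{G \in \mathcal{G} : E_{G_0}\{f_G(X)/f_{G_0}(X)\}^2 < \infty, \int f_G I\{f_{G_0} > 0\}\, d\nu = 1\}$,

where $f_G(x) = \int f(x \mid \vartheta)\,G(d\vartheta)$, and define

(2.10)    $\mathcal{V}_{G_0} = \{v(x) : E_G v(X) = E_G u(X,\theta) \ \forall G \in \mathcal{G}_{G_0}\}$.

Theorem 2.3. (i) If $\mathcal{V}_{G_0}$ is nonempty, then $\{\hat S_n, n \ge 1\}$ is an asymptotically
efficient estimator of (1.1) at $P_{G_0}$ iff $\hat S_n = \sum_{j=1}^{n} v_{G_0}(X_j) + o_{P_{G_0}}(n^{1/2})$
with

(2.11)    $v_{G_0} = \arg\min\{E_{G_0}(v(X) - u(X,\theta))^2 : v \in \mathcal{V}_{G_0}\}$.

(ii) If $\mathcal{V}_{G_0}$ is empty, then there does not exist any regular $n^{-1/2}$-consistent
estimator of $E_G u(X,\theta)$ or $S_n/n$ in contiguous neighborhoods of $P_{G_0}$.

The definition of regular estimators of (1.1) is given in Section 6.
Suppose that for a certain class $\mathcal{G}^*$ the collection

(2.12)    $\mathcal{V}^* = \{v(x) : E_G v(X) = E_G u(X,\theta), \ E_G v^2(X) < \infty \ \forall G \in \mathcal{G}^*\}$

is nonempty, for example, certain $\mathcal{V}_{G_0}$ as in Theorem 2.3(i). Let
$\|h\|_G = \{E_G h^2(X)\}^{1/2}$.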

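Theorem 2.4. Let $v_{G_0}$ be as in (2.11). Suppose $v_{G_0} \in \mathcal{V}^*$ and as
$(\epsilon, n) \to (0, \infty)$,

$\sup\left\{ \left| \sum_{j=1}^{n} \frac{v_G(X_j) - v_{G_0}(X_j)}{n^{1/2}} \right| : \|v_G - v_{G_0}\|_{G_0} \le \epsilon, \ G \in \mathcal{G}^* \right\} \to 0$    in $P_{G_0}$

for all $G_0 \in \mathcal{G}^*$. Let $\hat G$ be an estimator of $G$ such that $P_{G_0}\{\hat G \in \mathcal{G}^*\} \to 1$ and
$\|v_{\hat G} - v_{G_0}\|_{G_0} \to 0$ in $P_{G_0}$ for all $G_0 \in \mathcal{G}^*$. Then

(2.13)    $\hat V_n = \sum_{j=1}^{n} v_{\hat G}(X_j)$

is an asymptotically efficient estimator of (1.1) at $P_{G_0}$ for all $G_0 \in \mathcal{G}^*$.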



If $f(x \mid \vartheta)$ belongs to certain exponential families, there exists a unique
function $v$ such that $\mathcal{V}_{G_0} \ne \emptyset$ implies $\mathcal{V}_{G_0} = \{v\}$, so that $v_{G_0} = v$ for all $G_0$
and $\mathcal{V}^* = \{v\}$. The following theorem is a variation of Theorem 2.4 for such
distributions.

Theorem 2.5. Suppose $f(x \mid \vartheta) \propto \exp(x'\lambda(\vartheta))$, $\lambda(\vartheta) \in \Lambda$, is an exponential
family with an open $\Lambda$ in a Euclidean space, and that the conditional
distribution of $\theta$ given $\lambda(\theta)$ is known. Suppose $\mathcal{G}$ contains distributions $G = G_c$
with $E_G|\lambda(\theta) - c| = 0$ for all $c \in \Lambda$. If $\mathcal{V}_{G_0} \ne \emptyset$ for certain $G_0$, then there
exists a function $v(x)$ such that

(2.14)    $E_G[v(X) \mid \lambda(\theta) = c] = E_G[u(X,\theta) \mid \lambda(\theta) = c]$    $\forall c \in \Lambda, \ G \in \mathcal{G}$,

and such that the following $V_n$ is an efficient estimator of $S_n$, under
$\{P_G : E_G v^2(X) < \infty\}$:

(2.15)    $V_n = \sum_{j=1}^{n} v(X_j)$.

Remark 2.4. Robbins [24] called (2.15) "u,v" estimators, provided
that (2.14) holds. The $\hat V_n$ in (2.13) can be viewed as a "u,v" estimator
with an estimated optimal $v$. Theorems 2.4 and 2.5 provide conditions
under which these two types of estimators are asymptotically efficient.



2.3. The Poisson example. Let $(X, Y, \lambda) = (X, \theta)$ with
$E[Y \mid X, \lambda] = \lambda$,

(2.16)    $f(x \mid \lambda) = P(X = x \mid \lambda) = e^{-\lambda}\lambda^x/x!$,    $x = 0, 1, \dots$.

Robbins [22, 24] and Robbins and Zhang [25, 26, 27] considered the
estimation of $S_n' = \sum_{j=1}^{n} \lambda_j u(X_j)$ and $S_n'' = \sum_{j=1}^{n} Y_j u(X_j)$, and several related
problems.

Both $S_n'$ and $S_n''$ are special cases of (1.1). For $u(x) = I\{x \le a\}$, $S_n''$ could
be the total number of accidents next year for those motorists with no more
than $a$ accidents this year in the motorist example.

Suppose $\lambda_j$ have a common exponential density $\tau e^{-\tau\lambda}\,d\lambda$ with unknown
$\tau$. The marginal distribution of $X$ is $f_\tau(x) = \tau(1+\tau)^{-x-1}$, and the marginal
and conditional expectations of $\lambda u(X)$ and $Y u(X)$ are

$\bar u_\tau(x) = \frac{(x+1)u(x)}{1+\tau}$,    $\mu_\tau = \sum_{x=0}^{\infty} f_\tau(x)\, x\, u(x-1)$.
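The displayed formulas can be checked directly from the Poisson likelihood of this section and the exponential prior. The derivation below is a worked sketch that uses only the gamma integral and the conjugacy of the gamma family with the Poisson likelihood; it introduces no notation beyond that of this section.

```latex
\begin{align*}
f_\tau(x) &= \int_0^\infty e^{-\lambda}\frac{\lambda^x}{x!}\,\tau e^{-\tau\lambda}\,d\lambda
           = \frac{\tau}{x!}\cdot\frac{\Gamma(x+1)}{(1+\tau)^{x+1}}
           = \tau(1+\tau)^{-x-1},\\
E_\tau[\lambda \mid X=x]
          &= \frac{\int_0^\infty \lambda^{x+1}e^{-(1+\tau)\lambda}\,d\lambda}
                  {\int_0^\infty \lambda^{x}e^{-(1+\tau)\lambda}\,d\lambda}
           = \frac{x+1}{1+\tau},
\qquad\text{so}\qquad
\bar u_\tau(x) = \frac{(x+1)u(x)}{1+\tau},\\
\mu_\tau  &= \sum_{x=0}^\infty f_\tau(x)\,\frac{(x+1)u(x)}{1+\tau}
           = \sum_{x=0}^\infty f_\tau(x+1)\,(x+1)\,u(x)
           = \sum_{x=0}^\infty f_\tau(x)\,x\,u(x-1).
\end{align*}
```

Because $E[Y \mid X, \lambda] = \lambda$, the same conditional and marginal expectations apply to $Y u(X)$.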
Let $\bar X = \sum_{j=1}^{n} X_j/n$. Define $\hat\tau_n = (\beta + n)/(\alpha + \sum_{j=1}^{n} X_j)$ and

(2.17)    $\hat S_n = \sum_{j=1}^{n} \bar u_{\hat\tau_n}(X_j) = \sum_{j=1}^{n} \frac{(\alpha/n + \bar X)(X_j + 1)u(X_j)}{(\alpha+\beta)/n + 1 + \bar X}$.

It follows from Theorem 2.2 that the plug-in estimators in (2.17) are
asymptotically efficient for both $S_n'$ and $S_n''$. For $\alpha = \beta = 0$, (2.17) gives the plug-in
estimator corresponding to the maximum likelihood estimator (MLE) of $\tau$.
For general positive $\alpha$ and $\beta$, (2.17) gives the Bayes estimator of $S_n'$ and $S_n''$
with a beta prior on $\tau/(1+\tau)$. Clearly, $\hat\mu_n = \sum_{x=0}^{\infty} \{\hat\tau_n\, x\, u(x-1)\}/(1+\hat\tau_n)^{x+1}$
is efficient for the estimation of the mean $\mu_\tau = E_\tau u(X,\theta)$, but not for $S_n'/n$
or $S_n''/n$. Similar results can be obtained for $\lambda$ with the gamma distribution;
see [23].

In the case of completely unknown $G(d\lambda)$, the "u,v" estimator (2.15) with
$v(x) = x\,u(x-1)$ is asymptotically efficient for the estimation of $S_n'$ and $S_n''$
for all $G$ with finite $E_G\{v(X) - \lambda u(X)\}^2$.
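The comparison between the plug-in estimator (2.17) and the "u,v" estimator (2.15) can be illustrated by simulation. The sketch below is not part of the paper: it assumes the exponential prior with rate `tau`, takes $u(x) = I\{x \le a\}$, uses $\alpha = \beta = 0$ in (2.17), and targets $S_n' = \sum_j \lambda_j u(X_j)$. The function names, the seeds and the Knuth-style Poisson sampler are illustrative choices.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(n, tau, a, seed=0):
    """Return (S'_n, plug-in estimate (2.17) with alpha = beta = 0, "u,v" estimate (2.15))."""
    rng = random.Random(seed)
    lam = [rng.expovariate(tau) for _ in range(n)]      # lambda_j with density tau * exp(-tau * lambda)
    x = [sample_poisson(l, rng) for l in lam]           # X_j | lambda_j ~ Poisson(lambda_j)

    def u(k):                                           # u(x) = I{0 <= x <= a}
        return 1.0 if 0 <= k <= a else 0.0

    target = sum(l * u(k) for l, k in zip(lam, x))      # S'_n = sum_j lambda_j u(X_j)
    xbar = sum(x) / n
    shrink = xbar / (1.0 + xbar)                        # 1 / (1 + tau_hat) with tau_hat = 1 / xbar
    plug_in = sum(shrink * (k + 1) * u(k) for k in x)   # plug-in estimator (2.17)
    uv = sum(k * u(k - 1) for k in x)                   # (2.15) with v(x) = x u(x - 1)
    return target, plug_in, uv


def mean_squared_errors(n=400, tau=1.0, a=0, reps=100):
    """Monte Carlo mean squared errors of the two estimators of S'_n."""
    runs = [simulate(n, tau, a, seed=s) for s in range(reps)]
    mse_plug = sum((p - t) ** 2 for t, p, _ in runs) / reps
    mse_uv = sum((v - t) ** 2 for t, _, v in runs) / reps
    return mse_plug, mse_uv
```

Under the parametric exponential model the plug-in error should be visibly smaller, in line with Theorem 2.2; the "u,v" estimator trades that efficiency for validity under a completely unknown $G$.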



2.4. More examples. 



Example 2.1. Let $X \sim N(\tau, \sigma^2)$. The number of "above average"
individuals, $\hat S_n = \#\{j \le n : X_j > \bar X\}$, is an efficient estimator of the number of
above mean individuals $S_n(\tau) = \#\{j \le n : X_j > \tau\}$. The estimator $\tilde S_n = n/2$
is efficient for the estimation of $E_\tau S_n(\tau) = n/2$, but not $S_n(\tau)$.
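Example 2.1 is easy to check numerically. The following sketch, which is an illustration rather than anything taken from the paper, simulates normal samples and compares the per-observation variance of the error of the "above average" count with that of the constant $n/2$ as estimators of $S_n(\tau)$. The sample size, number of replications and seeds are arbitrary assumptions.

```python
import random
import statistics


def above_counts(n, tau=0.0, sigma=1.0, seed=0):
    """Return (S_n(tau), number of observations above the sample mean)."""
    rng = random.Random(seed)
    x = [rng.gauss(tau, sigma) for _ in range(n)]
    xbar = statistics.fmean(x)
    s_true = sum(1 for v in x if v > tau)     # S_n(tau): uses the unknown tau
    s_plug = sum(1 for v in x if v > xbar)    # "above average" count: uses data only
    return s_true, s_plug


def compare(n=200, reps=1000):
    """Per-observation error variances of the two estimators of S_n(tau)."""
    err_plug, err_const = [], []
    for seed in range(reps):
        s_true, s_plug = above_counts(n, seed=seed)
        err_plug.append(s_plug - s_true)
        err_const.append(n / 2 - s_true)
    return statistics.pvariance(err_plug) / n, statistics.pvariance(err_const) / n
```

The constant $n/2$ has error variance exactly $n/4$ because $S_n(\tau)$ is binomial$(n, 1/2)$, while the "above average" count tracks $S_n(\tau)$ more closely since both counts move together with the sample.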



Example 2.2. Let $f(x \mid \vartheta) \sim N(\vartheta, \sigma^2)$. An efficient estimator for the
number of "above mean" individuals, $S_n = \#\{j \le n : X_j > \theta_j\}$, is $\hat S_n = n/2$,
compared with Example 2.1. This is even true under the condition
$\sum_{j=1}^{n} \theta_j^2 = O(1)$, that is, in contiguous neighborhoods of $P_0$ with
$P_0\{\theta_j = 0\} = 1$.

Example 2.3. $\hat S_n = 0$ is efficient for the estimation of $S_n(\tau) = \sum_{j=1}^{n} \bar\rho_\tau(X_j)$.



3. A species problem. An interesting example of our problem is estimat- 
ing the total number of species in a population of plants or animals. Suppose 
a random sample of size N is drawn (with replacement) from a population of 
$d$ species. Let $n_k$ be the number of species represented $k$ times in the sample.
A species problem is to estimate $d$ based on $\{n_k, k \ge 1\}$. The problem dates
back to [13] and [14] and has many important applications [4]. We consider 
a network application in Section 4. 
3.1. Finite-dimensional models. Let $X_j$ be the frequencies of the $j$th
species in the sample, so that, for certain $p_j > 0$,

(3.1)    $n_k = \sum_{j=1}^{d} I\{X_j = k\}$,    $(X_1, \dots, X_d) \sim \text{multinomial}(N, p_1, \dots, p_d)$.

We will confine our discussion to the case of $(N, N/d) \to (\infty, \mu)$, $0 < \mu < \infty$,
since $E(d - \sum_{k=1}^{N} n_k) = \sum_{j=1}^{d} (1 - p_j)^N \to 0$ as $N \to \infty$ for fixed $d$. Let
$\{G_\tau, \tau \in \mathcal{T}\}$ be a parametric family of distributions in $(0, \infty)$, where $\tau$ is an
unknown parameter with a scale component, $G_\tau(y/c) = G_{\tau_c}(y)$. Let $P_\tau$ be
probability measures under which (3.1) holds conditionally on $N$ and certain
i.i.d. variables $\theta_j > 0$, and

(3.2)    $p_j = \frac{\theta_j}{\sum_{i=1}^{d} \theta_i}$,    $N \mid \{\theta_j\} \sim \text{Poisson}\left(c \sum_{j=1}^{d} \theta_j\right)$,    $\theta_j \sim G$,

with $G = G_\tau$. Under $P_\tau$, $X_j$ are i.i.d. with $P_\tau\{X_j = k\} = \int e^{-y}(y^k/k!)\,G_{\tau_c}(dy)$.
Assume $c = 1$ due to scale invariance. Since $n_0$ is unobservable, the MLE of
$(d, \tau)$ is

(3.3)    $\hat d = \frac{\sum_{k=1}^{N} n_k}{\int (1 - e^{-y})\,G_{\hat\tau}(dy)}$,    $\hat\tau = \arg\max_{\tau \in \mathcal{T}} \prod_{k=1}^{N} \left\{ \frac{\int e^{-y}(y^k/k!)\,G_\tau(dy)}{1 - \int e^{-y}\,G_\tau(dy)} \right\}^{n_k}$.

In the next two paragraphs we derive the influence function for the MLE
(3.3) and prove its asymptotic efficiency.

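If (2.2) holds and the MLE $\hat\tau$ of $\tau$ is asymptotically efficient, then

(3.4)    $\hat\tau = \tau + \frac{1}{d}\sum_{j=1}^{d} \kappa_{*,\tau}(X_j) + o_P(d^{-1/2})$

with $\kappa_{*,\tau} = \{\mathrm{Cov}_\tau(\tilde\rho_\tau(X))\}^{-1}\tilde\rho_\tau$ and $\tilde\rho_\tau(x) = I\{x > 0\}(\bar\rho_\tau(x) - \gamma_\tau)$, where $\bar\rho_\tau$ is as
in (2.3) and $\gamma_\tau = E_\tau[\bar\rho_\tau(X) \mid X > 0]$. Thus, by the Taylor expansion of the $\hat d$
in (3.3),

(3.5)    $\hat d = d + \sum_{j=1}^{d} \phi_{*,\tau}(X_j) + o_P(d^{1/2})$,

where $\phi_{*,\tau}(x) = I\{x > 0\}/P_\tau(X > 0) - 1 - \kappa_{*,\tau}'(x)\gamma_\tau$. In this case, as $d \to \infty$.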

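For the gamma $G(dy;\tau) \propto y^{\alpha-1}\exp(-y/\beta)\,dy$, the MLE $\hat\tau = (\hat\alpha, \hat\beta)$ satisfies

(3.7)    $\sum_{k=1}^{N} \frac{\sum_{j \ge k} n_j}{\hat\alpha + k - 1} = \frac{\tilde d \log(1 + \hat\beta)}{1 - (1 + \hat\beta)^{-\hat\alpha}}$,    $N = \frac{\tilde d\, \hat\alpha \hat\beta}{1 - (1 + \hat\beta)^{-\hat\alpha}}$,

with $\tilde d = \sum_{k \ge 1} n_k$, and (3.4) holds [29]. Rao [19] called (3.3) with (3.7)
pseudo MLE in a different (gamma) model, but the efficiency of the $\hat d$ was
not clear [11].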

The species problem is a special case of estimating (1.1) when $d$ is viewed
as the number of species represented in the population out of a total of $n$
species. Specifically, letting $p_j = 0$ if the $j$th species is not represented in the
population, estimating

(3.8)    $d = \sum_{j=1}^{n} I\{p_j > 0\} = \sum_{j=1}^{n} I\{X_j = 0, p_j > 0\} + \sum_{k=1}^{N} n_k$

is equivalent to estimating (1.1) with $u(x,p) = I\{p > 0\}$ or
$u(x,p) = I\{x = 0, p > 0\}$, based on observations $\{X_j, j \le n\}$. Under (3.1) and (3.2) with $d$
replaced by $n$,

(3.9)    $P_{p_0,\tau}\{X_j = k\} = (1 - p_0) I\{k = 0\} + p_0\, \frac{\int e^{-y}(y^k/k!)\,G_\tau(dy)}{\int (1 - e^{-y})\,G_\tau(dy)}\, I\{k > 0\}$

with certain $p_0 \le \int (1 - e^{-y})\,G_\tau(dy)$. Under (3.9), the $\hat\tau$ in (3.3) is the
conditional MLE of $\tau$ given $\{n_k, k \ge 1\}$. Since $(\sum_{k=1}^{N} n_k, \ d - \sum_{k=1}^{N} n_k, \ n - d)$ is a trinomial
vector, $\hat\tau$ in (3.3) equals the MLE of $\tau$ based on a sample $\{X_j, j \le n\}$ from
(3.9), provided that $\hat d$ in (3.3) is no greater than $n$. Since $P_{p_0,\tau}\{\hat d \le n\} \to 1$
under (3.9), by Theorem 2.1, the (conditional) MLE (3.3) is asymptotically
efficient in the empirical Bayes model (3.2) under conditions (2.2), (3.4) and
(3.5).

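3.2. General mixture. Now, suppose the distribution $G$ in (3.2) is
completely unknown. The nonparametric MLE of $(d, G)$ is given by

(3.10)    $\hat d = \frac{\tilde d}{\int (1 - e^{-y})\,\hat G(dy)}$,    $\hat G = \arg\max_{G} \prod_{k=1}^{N} \left\{ \frac{\int e^{-y}(y^k/k!)\,G(dy)}{1 - \int e^{-y}\,G(dy)} \right\}^{n_k}$,

with $\tilde d = \sum_{k=1}^{N} n_k$, but its asymptotic distribution is unclear. Since there is
no solution $v$ to the equation $\sum_{x=0}^{\infty} v(x) e^{-\vartheta}\vartheta^x/x! = I\{\vartheta > 0\}$ for $0 \le \vartheta < \infty$,
by Theorems 2.3 and 2.5, the estimation of $d$ with completely unknown $G$
is an ill-posed problem.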

Among many choices, a compromise between (3.3) and (3.10) is to fit
$E_\tau n_k \propto P_\tau(X = k) = \int e^{-y}(y^k/k!)\,G_\tau(dy)$ for $1 \le k \le m$. For gamma $G$ with
$E n_{k+1}/E n_k = (k + \alpha)\beta/\{(k+1)(1 + \beta)\}$, fitting the negative binomial distribution
yields

(3.11)    $\hat d = \tilde d + \max(\hat\tau_1, 0)\, n_1$,    $\tilde d = \sum_{k=1}^{N} n_k$,

where $\hat\tau_1$ is the (weighted) least squares estimate of $\tau_1 = (\beta + 1)/(\alpha\beta)$ based
on

$n_k = \tau_1 (k+1) n_{k+1} + \tau_2 (k n_k) + \text{error}$,    $k = 1, \dots, m-1$,    $\tau_2 = -1/\alpha$,

with $n_k$ being a response variable and $((k+1) n_{k+1}, k n_k)$ being covariates for each
$k$. For small $\theta_j$ (large $n_k$ for small $k$), (3.11) has high efficiency for gamma $G$
and small bias for $G(y) = c_1 y^{\alpha} + (c_2 + o(1)) y^{\alpha+1}$ at $y \approx 0$. Chao [5] proposed
$\tilde d + n_1^2/(2 n_2)$ as a low estimate of $d$. Another possibility is to estimate $d$ by
correcting the bias of the estimator $\tilde d/(1 - n_1/N)$ of Darroch and Ratcliff
[9] as in [6].
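The three estimators of $d$ discussed in this subsection are simple functions of the frequency counts $n_k$. The sketch below is an illustration under stated assumptions: the regression step uses ordinary (unweighted) least squares without an intercept, whereas the text allows weights; the function names are hypothetical; and the choice to raise an error when an estimator is undefined is made here, not in the paper.

```python
def frequency_counts(abundances):
    """n_k = number of species observed exactly k times, for k >= 1."""
    counts = {}
    for x in abundances:
        if x > 0:
            counts[x] = counts.get(x, 0) + 1
    return counts


def observed_species(nk):
    """Number of species seen at least once: the sum of n_k over k >= 1."""
    return sum(nk.values())


def chao_lower_estimate(nk):
    """Chao's low estimate: observed count plus n_1^2 / (2 n_2)."""
    n1, n2 = nk.get(1, 0), nk.get(2, 0)
    if n2 == 0:
        raise ValueError("n_2 = 0: the estimate is undefined")
    return observed_species(nk) + n1 * n1 / (2.0 * n2)


def darroch_ratcliff_estimate(nk):
    """Observed count divided by 1 - n_1 / N, with N = sum of k * n_k."""
    total = sum(k * c for k, c in nk.items())
    n1 = nk.get(1, 0)
    if total == 0 or n1 >= total:
        raise ValueError("n_1 / N must be below 1")
    return observed_species(nk) / (1.0 - n1 / total)


def regression_estimate(nk, m):
    """Estimator (3.11): least squares fit of n_k on ((k+1) n_{k+1}, k n_k), k = 1..m-1."""
    rows = [((k + 1) * nk.get(k + 1, 0), k * nk.get(k, 0), nk.get(k, 0))
            for k in range(1, m)]
    a11 = sum(a * a for a, _, _ in rows)
    a12 = sum(a * b for a, b, _ in rows)
    a22 = sum(b * b for _, b, _ in rows)
    b1 = sum(a * y for a, _, y in rows)
    b2 = sum(b * y for _, b, y in rows)
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("covariates are collinear; increase m")
    tau1 = (b1 * a22 - b2 * a12) / det       # first coefficient of the 2x2 normal equations
    return observed_species(nk) + max(tau1, 0.0) * nk.get(1, 0)
```

When the $n_k$ equal their negative binomial expectations exactly, the regression has zero residual and $\hat\tau_1 n_1$ reproduces the expected number of unseen species, which is the sense in which (3.11) fits the negative binomial distribution.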

4. Networks: estimation of node degrees based on source-destination data. 

Source-destination (SD) data in networks are generated by sending probes
(e.g., traceroute queries in the Internet) through networks from certain source
nodes to certain destination nodes; see [8, 32]. We shall treat SD data as a
collection of random vectors $W_j$, $j = 1, \dots, N$, generated from a sample of
SD pairs and make statistical inference based on U-processes of $\{W_j\}$, for
example,

(4.1)    $\frac{1}{N}\sum_{j=1}^{N} h_1(W_j)$,    $\frac{\sum_{1 \le i \ne j \le N} h_2(W_i, W_j)}{N(N-1)}$,

indexed by Borel $h_1$ and $h_2$, where $W_j$ are the observations from the $j$th SD
pair in the sample. We focus here on the estimation of node degrees, although
the approach based on (4.1) could be useful in other network problems.

The topology of a deterministic network can be described with a routing
table: a list $r_1, \dots, r_J$ of directed paths representing connections between
pairs of source and destination nodes, with each path being composed of a
set of directed links. For example, the path $4 \to 2 \to 3 \to 8$ has source node
4, destination node 8, and links $4 \to 2$, $2 \to 3$ and $3 \to 8$. Consider a network
with nodes $\{1, \dots, K\}$. The link degree $D(k,\ell)$ is defined as the number of
paths using the link $k \to \ell$,

(4.2)    $D(k,\ell) = \#\{j \le J : \text{link } k \to \ell \text{ is used in } r_j\}$,

with $D(k,\ell) = 0$ if $k \to \ell$ is nonexistent or never used. The node degree,
defined as

(4.3)    $d_k = \sum_{\ell=1}^{K} I\{D(k,\ell) > 0\}$,

is the number of outgoing links from $k$ to other nodes. This is also called
out-degree. The in-degree, $\sum_{\ell} I\{D(\ell,k) > 0\}$, is the number of incoming links
to $k$. The node degrees $d_k$ and their (empirical) distributions are important
characteristics of networks; see [12, 15, 30].
For a given sample size $N$, let $R_1, \dots, R_N$ be a sample of SD pairs from the
routing table $\{r_1, \dots, r_J\}$. Suppose we observe the paths of $R_j$, so that the
vectors $W_j = (W_{1j}, \dots, W_{Kj})'$ are given by $W_{kj} = \ell$ if link $k \to \ell$ is used in $R_j$
for some $1 \le \ell \le K$ and $W_{kj} = 0$ otherwise. The observed link frequencies
are

(4.4)    $X_{k\ell} = \#\{j \le N : \text{link } k \to \ell \text{ is used in } R_j\} = \sum_{j=1}^{N} I\{W_{kj} = \ell\}$.

Since $X_{k\ell} = 0$ for $D(k,\ell) = 0$, by (4.3), the node degree $d_k$ is a sum

(4.5)    $d_k = \tilde d_k + S_k$,    $\tilde d_k = \sum_{\ell=1}^{K} I\{X_{k\ell} > 0\}$,

where $\tilde d_k$ is the observed degree and $S_k$ is the unobserved degree given by

(4.6)    $S_k = \sum_{\ell=1}^{K} I\{X_{k\ell} = 0, D(k,\ell) > 0\}$.

Lakhina, Byers, Crovella and Xie [16] and Clauset and Moore [7] pointed
out that the observed degrees $\tilde d_k$ may grossly underestimate the true node
degree $d_k$.
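The quantities in (4.2)-(4.6) can be computed directly from a routing table and a sample of observed paths. The sketch below is illustrative only: paths are represented as tuples of node labels, and all function names are hypothetical rather than taken from the paper or from any network library.

```python
from collections import Counter


def links_of(path):
    """Directed links k -> l along a path given as a sequence of nodes."""
    return list(zip(path[:-1], path[1:]))


def link_degrees(routing_table):
    """D(k, l) as in (4.2): the number of paths in the table that use link k -> l."""
    degrees = Counter()
    for path in routing_table:
        for link in set(links_of(path)):      # count each path at most once per link
            degrees[link] += 1
    return degrees


def true_out_degrees(routing_table):
    """d_k as in (4.3): the number of links k -> l with D(k, l) > 0."""
    return Counter(k for (k, _l) in link_degrees(routing_table))


def observed_link_frequencies(sampled_paths):
    """X_{kl} as in (4.4): the number of sampled SD pairs whose path uses k -> l."""
    return link_degrees(sampled_paths)


def observed_out_degrees(x):
    """Observed degree in (4.5): the number of links k -> l with X_{kl} > 0."""
    return Counter(k for (k, _l), count in x.items() if count > 0)


def unobserved_degrees(true_deg, observed_deg):
    """S_k as in (4.6): the true degree minus the observed degree."""
    return {k: true_deg[k] - observed_deg[k] for k in true_deg}
```

In practice only the observed frequencies are available, so `unobserved_degrees` is useful for simulation studies where the routing table is known; estimating $S_k$ from data is the species problem of Section 3.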

It follows from (4.5), (4.6) and (3.8) that the problem of estimating the 
node degree (4.3) is a species problem. From this point of view, we may di- 
rectly use estimators in Section 3 and references therein, for example, (3.11). 
However, in network problems, we are typically interested in simultaneous 
estimation of many node degrees. Thus, information from $\{X_{k\ell}, \ell \le K\}$ can
be pooled from different nodes $k$. Let $\mathcal{K} \subseteq \{1, \dots, K\}$ be a collection of
"similar" and/or "independent" nodes. Let $\mathcal{G}$ be a family of distributions,
for example, gamma with unit scale. Suppose the $G$ in (3.2) for different
nodes are identical to a member of $\mathcal{G}$ up to scale parameters $\beta_k$. Then, as
in (3.10), the (pseudo) MLE for $\{d_k, \beta_k, k \in \mathcal{K}, G\}$ is given by



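(4.7)    $\hat d_k = \frac{\sum_{j=1}^{N} n_{kj}}{\int (1 - e^{-\hat\beta_k y})\,\hat G(dy)}$,    $(\hat\beta, \hat G) = \arg\max \prod_{k \in \mathcal{K}} \prod_{j=1}^{N} \left\{ \frac{\beta_k^j \int e^{-\beta_k y} y^j\,G(dy)}{\int (1 - e^{-\beta_k y})\,G(dy)} \right\}^{n_{kj}}$,

where $n_{kj}$ is the number of links $\ell$ with $X_{k\ell} = j$ as in (3.1), $\beta = (\beta_1, \dots, \beta_K)$
and the maximum is taken over all $\beta_k > 0$ and $G \in \mathcal{G}$.
This type of estimator is expected to perform well for self-similar networks.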

In the nonparametric case of completely unknown $G$, the MLE $(\hat\beta, \hat G)$ in
(4.7) can be computed via the following EM algorithm:

Mj + l;/3f\GM) ^ p(1;/3^),GM) ^ 
with p(i;A,G)^/e-^^ VG(t^y), 



N 



^ G^^\d^) 5] -p(0;/3f 

\fcG/Cj = l / 

5. Data confidentiality: estimation of risk in statistical disclosure. A ma- 
jor concern in releasing microdata sets is protecting the privacy of individ- 
uals in the sample. Consider a data set in the form of a high-dimensional 
contingency table. If an individual belongs to a cell with small frequency, an 
intruder with certain knowledge about the individual may identify him and 
learn sensitive information about him in the data. Statistical models and 
methods concerning the risk of such breach of confidentiality have been con- 
sidered by many; see [10] and the proceedings of the joint ECE/EUROSTAT 
work sessions on statistical data confidentiality. For multi-way contingency 
tables, Polettini and Seri [18] and Rinott [21] studied the estimation of global 
disclosure risks of the form 

(5.1)    $S_J = \sum_{j=1}^{J} u(X_j, Y_j)$

based on $\{X_j, j \le J\}$, where $X_j$ and $Y_j$ are the sample and population
frequencies in the $j$th cell, $J$ is the total number of cells, and $u(x,y)$ is a loss
function of the form $u(x,y) = u(x)/y$, for example, $u(x,y) = y^{-1} I\{x = 1\}$.
Let $N = \sum_{j=1}^{J} Y_j$ be the population size. Suppose $N \sim \text{Poisson}(\lambda)$,

(5.2)    $\{Y_j\} \mid N \sim \text{multinomial}(N, \{\pi_j\})$,    $X_j \mid (\{Y_j\}, N) \sim \text{binomial}(Y_j, p_j)$,

for certain $\pi_j > 0$ with $\sum_{j=1}^{J} \pi_j = 1$, $0 < p_j < 1$ and $\lambda > 0$. For known
$\{p_j, \pi_j, \lambda\}$, the Bayes estimator of $S_J$ in (5.1) is

(5.3)    $S_J^* = E(S_J \mid \{X_j\}) = \sum_{j=1}^{J} \bar u_j(X_j)$,    $\bar u_j(x) = E u(x, Y_j - X_j + x)$,

with $Y_j - X_j \sim \text{Poisson}((1 - p_j)\pi_j\lambda)$ (independent of $X_j$). For
$u(x,y) = y^{-1} I\{x = 1\}$,

(5.4)    $\bar u_j(x) = \{(1 - p_j)\pi_j\lambda\}^{-1}[1 - \exp\{-(1 - p_j)\pi_j\lambda\}]\, I\{x = 1\}$.

In general, the parameters $(1 - p_j)\pi_j\lambda$ cannot be completely identified
from the data $X_j \sim \text{Poisson}(p_j\pi_j\lambda)$, so that it is necessary to further model
the parameters. This can be achieved by setting $\{p_j, \pi_j, \lambda\}$ to known tractable
functions of an unknown vector $\tau$ and certain covariates $Z_j$ characterizing
cells $j$, and by incorporating all available knowledge about the parameters,
for example, $\lambda \approx N$ and $\sum_{j=1}^{J} p_j\pi_j \approx n/N$, where $n = \sum_{j=1}^{J} X_j$ is the sample
size. Consequently, the conditional expectation $\bar u_j(x)$ in (5.4) can be written
as $\bar u_j(x) = \bar u(x, Z_j; \tau)$. This suggests

(5.5)    $\hat S_J = \sum_{j=1}^{J} \bar u(X_j, Z_j; \hat\tau_J)$

as an estimator of the global risk (5.1) and its conditional expectation (5.3),
where $\hat\tau_J$ is a suitable (e.g., the maximum likelihood or method of moments)
estimator of $\tau$. For example, in a two-way table with cells labelled by
$j \sim (i,k)$ and known $\pi_{i,k}$ and $\lambda$, we may assume a regression model
$p_{i,k} = \psi_0(\tau_1 + \tau_2 Z_{i,k})$ for a certain known (e.g., logit or probit) function $\psi_0$. In the case of
unknown $\pi_{i,k}$, we may consider the independence model $\pi_{i,k} = \pi_{i\cdot}\pi_{\cdot k}$ with
unknown $\pi_{i\cdot}$ and known or unknown $\pi_{\cdot k}$. If $\tau$ has fixed dimensionality and
$\hat\tau_J$ is asymptotically efficient, (5.5) is efficient by Theorem 2.2. Theorem 2.2
also suggests that (5.5) is highly efficient if $\dim(\tau)/J \to 0$.

Alternatively, we may consider the negative binomial model
$N \sim \text{NB}(\alpha, 1/(1 + \beta))$, that is, $P(N = k) = \Gamma(k + \alpha)\{\Gamma(\alpha) k!\}^{-1} \beta^k/(1 + \beta)^{k+\alpha}$. As in
[21], we have in this case $Y_j \sim \text{NB}(\alpha, 1/(1 + \beta_j))$ with $\beta_j = \beta\pi_j$,
$X_j \sim \text{NB}(\alpha, 1/(1 + p_j\beta_j))$, and
$(Y_j - X_j) \mid \{X_j = x\} \sim \text{NB}(x + \alpha, (1 + p_j\beta_j)/(1 + \beta_j))$. Consequently,

(5.6)    $\bar u_j(x) = \frac{1 + p_j\beta_j}{(1 - p_j)\beta_j} \int_{(1 + p_j\beta_j)/(1 + \beta_j)}^{1} t^{\alpha - 1}\,dt \; I\{x = 1\}$

in (5.3) for $u(x,y) = y^{-1} I\{x = 1\}$. Bethlehem, Keller and Pannekoek [2]
studied this negative binomial model with constant $\pi_j = 1/J$ and
$p_j = En/EN \approx n/N$. For $(\alpha, \beta_j) \to (0, \infty)$, $(Y_j - X_j) \mid \{X_j = x\}$ converges in
distribution to the NB$(x, p_j)$, resulting in the $\mu$-ARGUS estimator [1] with
$\bar u_j(x) = p_j(1 - p_j)^{-1}(-\log p_j) I\{x = 1\}$ in (5.6), as pointed out by Rinott [21].
Compared with the Poisson model in which $\lambda \approx N$, estimates of both $EN$
and $\mathrm{Var}(N)$ are required in the negative binomial model. The $\mu$-ARGUS
model essentially assumes $\mathrm{Var}(N)/(EN)^2 \approx \infty$, which may not be
suitable in some applications.
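The per-cell risk contributions in (5.4) and (5.6), and the $\mu$-ARGUS limit, are closed-form expressions that are simple to evaluate. The following sketch is an illustration, not the paper's software: it assumes the loss $u(x,y) = y^{-1} I\{x = 1\}$, treats all parameters as known, and uses hypothetical function names.

```python
import math


def risk_poisson(x, p, pi, lam):
    """Per-cell risk (5.4) under the Poisson model, for u(x, y) = I{x = 1} / y."""
    if x != 1:
        return 0.0
    m = (1.0 - p) * pi * lam                  # mean of the unsampled count Y_j - X_j
    return -math.expm1(-m) / m                # (1 - exp(-m)) / m


def risk_negbin(x, p, pi, alpha, beta):
    """Per-cell risk (5.6) under the negative binomial model."""
    if x != 1:
        return 0.0
    beta_j = beta * pi
    q = (1.0 + p * beta_j) / (1.0 + beta_j)   # lower limit of the integral in (5.6)
    integral = -math.expm1(alpha * math.log(q)) / alpha   # integral of t^(alpha-1) over [q, 1]
    return q / (1.0 - q) * integral           # q / (1 - q) = (1 + p beta_j) / ((1 - p) beta_j)


def risk_mu_argus(x, p):
    """Limit of (5.6) as (alpha, beta_j) -> (0, infinity)."""
    if x != 1:
        return 0.0
    return p / (1.0 - p) * (-math.log(p))


def global_risk(cells, cell_risk):
    """Sum of per-cell risks; `cells` is an iterable of argument tuples for `cell_risk`."""
    return sum(cell_risk(*cell) for cell in cells)
```

For instance, `global_risk([(1, 0.2, 0.01, 300.0), (3, 0.2, 0.02, 300.0)], risk_poisson)` adds the contribution of a sample-unique cell and a zero contribution from a cell with three sampled individuals.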



6. General information bounds. We provide a lower bound for the asymp- 
totic variance and a convolution theorem for (locally asymptotically) regular 
estimators of the sum in (1.2). To facilitate the statements of our results, we 
first briefly describe certain terminologies and concepts in general asymp- 
totic theory. 
6.1. Scores and tangent spaces. Suppose $(X,\theta) \sim F$ with $F \in \mathcal{F}$, where
$\mathcal{F}$ is a family of joint distributions. Let $\mathcal{C} = \mathcal{C}(F_0)$ be a collection of mappings
$\{F_t, 0 \le t \le 1\}$ from $[0,1]$ to $\mathcal{F}$ satisfying

(6.1)    $E_{F_0}\{\sqrt{f_t(X)} - 1 - t\rho(X)/2\}^2 = o(t^2)$,    $E_{F_0} f_t(X) = 1 + o(t^2)$,

for certain score functions $\rho(x) = \rho(x;\{F_t\})$ depending on the mappings
$\{F_t\}$, where $f_t = dF_t^X/dF_0^X$ is the Radon-Nikodym derivative of the
absolutely continuous part of the marginal distribution $F_t^X$ of $X$ under $F_t$ with
respect to the marginal distribution $F_0^X$. Let $\mathcal{C}_* = \mathcal{C}_*(F_0)$ be the collection
of score functions $\rho(X)$ generated by $\mathcal{C}$. The tangent space $H_* = H_*(F_0)$ is
the closure of the linear span $[\mathcal{C}_*]$ of $\mathcal{C}_*$ in $L_2(F_0)$; that is,

(6.2)    $H_* = \overline{[\mathcal{C}_*]}$,    $\mathcal{C}_* = \{\rho(\cdot;\{F_t\}) : \{F_t\} \in \mathcal{C}\}$.

For further discussion about score and tangent space, see [3], pages 48-57.
The second part of (6.1) holds in regular parametric models; see [3], page
459.

6.2. Smoothness of random variables and their distributions. Let $\mathcal{L}(U; F)$
be the distribution of $U$ under $P_F$. Suppose that, for all $\{F_t\} \in \mathcal{C}$, the
random variables $u_{F_t} = u(X,\theta;F_t)$ and $\bar u_{F_t} = E_{F_t}[u_{F_t} \mid X]$ satisfy the continuity
conditions

(6.3)    $\lim_{t \to 0+} \mathrm{Var}_{F_0}(\bar u_{F_t} - \bar u_{F_0}) = 0$,

(6.4)    $\mathcal{L}(w_{F_t}; F_t) \to \mathcal{L}(w_{F_0}; F_0)$,    $E_{F_t} w_{F_t}^2 \to E_{F_0} w_{F_0}^2$,

as $t \to 0+$, with $w_F = u_F - \bar u_F$, and also satisfy the differentiability condition

(6.5)    $\lim_{t \to 0+} E_{F_0}(\bar u_{F_t} - \bar u_{F_0})/t = E_{F_0}\phi(X)\rho(X)$

for certain $\phi(X) = \phi(X;F_0) \in L_2(F_0)$. The usual smoothness condition for
$\mu(F)$, see [3], pages 57-58, is that, for a certain influence function
$\psi(X) = \psi(X;F_0) \in L_2(F_0)$,

(6.6)    $\lim_{t \to 0+} \{\mu(F_t) - \mu(F_0)\}/t = E_{F_0}\psi(X)\rho(X)$.

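6.3. Regular estimators. An estimator $\hat\mu_n = \hat\mu_n(X_1, \dots, X_n)$ of $\mu(F)$ is
(locally asymptotically) regular at $F_0$ if there exists a random variable $\zeta_0$
such that

(6.7)    $\lim_{n \to \infty} \mathcal{L}(n^{1/2}\{\hat\mu_n - \mu(F_{c/\sqrt{n}})\}; F_{c/\sqrt{n}}) = \mathcal{L}(\zeta_0; F_0)$

for all $c > 0$ and $\{F_t\} \in \mathcal{C}$ ([3], page 21). Likewise, for the estimation of
the sum $S_n(F)$ in (1.2), we say that an estimator $\hat S_n = \hat S_n(X_1, \dots, X_n)$ is
regular at $F_0$ if there exists a random variable $\xi_0$ such that, for all $c > 0$ and
$\{F_t\} \in \mathcal{C}$,

(6.8)    $\lim_{n \to \infty} \mathcal{L}(n^{-1/2}\{\hat S_n - S_n(F_{c/\sqrt{n}})\}; F_{c/\sqrt{n}}) = \mathcal{L}(\xi_0; F_0)$.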
6.4. Efficient influence functions and information bounds. Let $\psi_*$ be the
projection of $\psi$ in (6.6) to the tangent space $H_*$ in (6.2). The standard
convolution theorem ([3], page 63) asserts that, for a certain variable $\zeta_0'$,

$\mathcal{L}(\zeta_0; F_0) = N(0, E_{F_0}\psi_*^2(X)) * \mathcal{L}(\zeta_0'; F_0)$

for the $\zeta_0$ in (6.7), and that efficient estimators are characterized by (1.4). For
$h \in L_2(F_0)$, let $A_n(h) = \sum_{j=1}^{n} h(X_j,\theta_j)/n$ and $Z_n(h) = \sqrt{n}\{A_n(h) - E_{F_0}h\}$.

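Theorem 6.1. Suppose (6.3), (6.4) and (6.5) hold at $F_0$. Let $\phi_{*,0}$ be
the projection of $\phi$ in (6.5) into the tangent space $H_*$ in (6.2), and let
$\phi_* = \bar u_{F_0} - \mu(F_0) + \phi_{*,0}$.

(i) If (6.8) holds, then $\mathrm{Var}_{F_0}(\xi_0) \ge \mathrm{Var}_{F_0}(\phi_* - u_{F_0})$. Moreover, the lower
bound is reached without bias, that is, $E_{F_0}\xi_0^2 = \mathrm{Var}_{F_0}(\phi_* - u_{F_0})$, iff (1.5)
holds.

(ii) If (6.8) holds and the $L_2(F_0)$ closure $\bar{\mathcal{C}}_*$ of $\mathcal{C}_*$ in (6.2) is convex,
then there exist a random variable $\tilde\xi_0$ and certain normal variables
$Z(h) \sim N(0, \mathrm{Var}_{F_0}(h))$ such that

$\mathcal{L}\left( \begin{pmatrix} \sqrt{n}\{\hat S_n/n - A_n(\phi_*) - \mu(F_0)\} \\ Z_n(u_{F_0} + h - \bar u_{F_0}) \end{pmatrix}; F_0 \right) \to \mathcal{L}\left( \begin{pmatrix} \tilde\xi_0 \\ Z(u_{F_0} + h - \bar u_{F_0}) \end{pmatrix}; F_0 \right)$

and $\tilde\xi_0$ is independent of $Z(u_{F_0} + h - \bar u_{F_0})$ for all $h \in H_*$. In particular, for
the $\xi_0$ in (6.8),

$\mathcal{L}(\xi_0; F_0) = \mathcal{L}(Z(\phi_* - u_{F_0}); F_0) * \mathcal{L}(\tilde\xi_0; F_0)$.

(iii) Suppose $E_{F_t}\bar u^2(X; F_t)$ is bounded for all $\{F_t\} \in \mathcal{C}$. Then, $\psi_* = \phi_{*,0} + \bar u_*$
is the efficient influence function for the estimation of $\mu(F)$, that is, (6.6)
holds with $\psi = \psi_*$, where $\bar u_*$ is the projection of $\bar u_{F_0}$ to $H_*$. Consequently,
(1.6) holds.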

Remark 6.1. Based on Theorem 6.1(i) and (ii), $\hat S_n$ is said to be locally
asymptotically efficient if (1.5) holds. Note that in Theorem 6.1(ii), $\tilde\xi_0 = 0$
iff (1.5) holds.

Remark 6.2. In the proof of Theorem 6.1(iii), we show that (6.5) and
(6.6) are equivalent under the condition that $E_{F_t}\bar u^2(X; F_t) = O(1)$ for all
$\{F_t\} \in \mathcal{C}$.
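The link between (6.5), (6.6) and (1.6) can be written out in a few lines. The following is a sketch of that step, assuming Lemma 7.2 applies to $h_t = \bar u(\cdot; F_t)$ (its $L_2$ continuity requirement is taken for granted here rather than verified); the $o(1)$ term is as $t \to 0+$.

```latex
\begin{align*}
\frac{\mu(F_t)-\mu(F_0)}{t}
  &= \frac{E_{F_0}(\bar u_{F_t}-\bar u_{F_0})}{t}
     + E_{F_0}\rho(X)\,\bar u_{F_0}(X) + o(1)
  && \text{(Lemma 7.2)}\\
  &\to E_{F_0}\{\phi(X)+\bar u_{F_0}(X)\}\rho(X)
  && \text{(by (6.5))},
\end{align*}
so (6.6) holds with $\psi = \phi + \bar u_{F_0}$. Projecting onto $H_*$ gives
$\psi_* = \phi_{*,0} + \bar u_*$, and therefore
\[
\phi_* = \bar u_{F_0} - \mu(F_0) + \phi_{*,0}
       = \psi_* + \bar u_{F_0} - \mu(F_0) - \bar u_*,
\]
which is (1.6).
```

Read in the other direction, the same identity shows that (6.6) with a given $\psi$ yields (6.5) with $\phi = \psi - \bar u_{F_0}$, which is the sense of the equivalence stated in the remark.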

Remark 6.3. For the estimation of $\mu(F)$, that is, $u(x,\vartheta;F) = \mu(F)$
as a special case of Theorem 6.1(ii), a standard proof of the convolution
theorem uses analytic continuation along lines passing through the origin in
the tangent space, and as a result, $\mathcal{C}_*$ is often assumed to be a linear space.
In the proof of Theorem 6.1(ii), analytic continuation is used along arbitrary
lines across $\bar{\mathcal{C}}_*$, so that only the convexity of $\bar{\mathcal{C}}_*$ is needed as in [31], pages
366-367. Rieder [20] showed that, in the case of convex $\mathcal{C}_*$, the projections
of scores to $\bar{\mathcal{C}}_*$ (not to $H_*$) are useful in the context of one-sided confidence.

6.5. Finite-dimensional models. Let $\mathcal{F} = \{F_\tau, \tau \in \mathcal{T}\}$ with an open
Euclidean parameter space $\mathcal{T}$. We shall extend the results in Section 2.1 to
general sums (1.2). Suppose $f_\tau = dF_\tau^X/d\nu$ exists and is differentiable in the
sense of (6.1), that is,

(6.9)    $\int \{f_{\tau+\Delta}^{1/2} - f_\tau^{1/2} - \Delta'\bar\rho_\tau f_\tau^{1/2}/2\}^2\, d\nu = o(\|\Delta\|^2)$,    $\tau \in \mathcal{T}$.

Let $E_\tau = E_{F_\tau}$, $I_\tau = \mathrm{Cov}_\tau(\bar\rho_\tau(X))$, $u_\tau = u(X,\theta;F_\tau)$ and $\bar u_\tau = \bar u(X;F_\tau)$.

Theorem 6.2. (i) Suppose (6.9) holds, $I_\tau$ is of full rank, $\mathcal{L}(u_\tau; F_\tau)$ is
continuous in $\tau$ in the weak topology, $E_\tau u_\tau^2$ is continuous, $E_\tau\{\bar u_{\tau+\Delta} - \bar u_\tau\}^2 \to 0$
as $\Delta \to 0$, $E_\tau \bar u_\tau^2$ is locally bounded, and $\mu'(\tau)$ exists. Then (2.4) gives the
efficient influence function for the estimation of (1.2) with
$\gamma_\tau = \mu'(\tau) - E_\tau \bar u_\tau \bar\rho_\tau$, and (1.5) and (1.6) hold.

(ii) Suppose (2.6), (2.7) and conditions of (i) hold. Then (2.8) holds for
the plug-in estimator (2.5) with the $\gamma_\tau$ in (i). In particular, (2.5) is
asymptotically efficient under $P_\tau$ iff $\gamma_\tau'\kappa_\tau = \gamma_\tau' I_\tau^{-1}\bar\rho_\tau$.

Remark 6.4. Comparing Theorem 6.2 with Theorems 2.1 and 2.2, we
see that (6.9) is weaker than (2.2) and (1.2) is more general than (1.1), while
stronger conditions are imposed on the utility in Theorem 6.2.



7. Proofs. We prove Theorems 6.1, 2.1, 2.2, 6.2, and 2.3-2.5 in this sec- 
tion. 



Lemma 7.1. Suppose (2.2) holds. Let $(X,\theta) \sim F_t$ under $P_{\tau+at}$ and
$\rho = a'\bar\rho_\tau$ for a vector $a$, where $\bar\rho_\tau$ is as in (2.3). Then (6.1) holds with $P_{F_0} = P_\tau$.

Proof. Let $g_t = g_{\tau,at}$ and $\Delta = at$. The lemma follows from the expansion



1 



P 
2 



1 



fl'^ + l 



En 



r 1/2 
^9t 



1 



X = x 


— Eq 


'a^Pr 


X = x 




2 





The uniform integrability of the square of the right-hand side (i.e., the first
term) under $f_0(x)$ follows from the inequality $E_0[g_t \mid X] \le f_t(X) I\{f_0(X) > 0\}$.
We omit the details. □
Lemma 7.2. Suppose (6.1) holds and $X \sim F_t$ under $P_t$, $0 \le t \le 1$. Let
$\mu_t = E_t h_t(X)$ for certain Borel $h_t$. If $E_t h_t^2(X) = O(1)$ and $h_t \to h_0$ in
$L_2(P_0)$, then

$\mu_t - \mu_0 = E_0\{h_t(X) - h_0(X)\} + tE_0\rho(X)h_0(X) + o(t)$ as $t \to 0+$.

Proof. Let $B_t$ be the support sets of $dP_t(X) - f_t(X)\,dP_0(X)$. By (6.1)
and the boundedness of $E_t h_t^2$, $E_t h_t - E_0 f_t h_t = E_t h_t I_{B_t} = O(1)(E_t h_t^2)^{1/2}
P_t^{1/2}(B_t) = o(t)$. Thus,

(7.1) $\mu_t - \mu_0 = E_t h_t - E_0 h_0 = E_0(f_t - 1)h_t + E_0(h_t - h_0) + o(t)$

as $t \to 0+$. Since $(\sqrt{f_t} - 1)/t \to \rho/2$ in $L_2(P_0)$ and $E_0\{(\sqrt{f_t} + 1)h_t\}^2 = O(1)$,

$E_0(f_t - 1)h_t/t = E_0[t^{-1}(\sqrt{f_t} - 1)(\sqrt{f_t} + 1)h_t] \to E_0 h_0\rho$.

This and (7.1) complete the proof. □
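
The expansion in Lemma 7.2 can be checked numerically. The sketch below is only an illustration under assumptions that are not in the paper: the path is $P_t$ = Poisson($\lambda_0 + t$), whose score at $t = 0$ is $\rho(x) = x/\lambda_0 - 1$, and the functions $h_t(x) = x^2 + tx$ are hypothetical. For this choice the remainder $\mu_t - \mu_0 - E_0\{h_t - h_0\} - tE_0\rho h_0$ works out to $2t^2$, so the remainder divided by $t$ should vanish linearly in $t$.

```python
import math

LAMBDA0 = 2.0  # hypothetical base intensity lambda_0


def poisson_pmf(x, lam):
    """Poisson(lam) probability of the count x, computed on the log scale."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def expect(func, lam, x_max=150):
    """E func(X) for X ~ Poisson(lam), with the sum truncated at x_max."""
    return sum(func(x) * poisson_pmf(x, lam) for x in range(x_max + 1))


def h(t, x):
    """Hypothetical h_t(x) = x^2 + t*x; it converges to h_0 in L2(P_0)."""
    return x ** 2 + t * x


def rho(x):
    """Score of the path Poisson(LAMBDA0 + t) at t = 0."""
    return x / LAMBDA0 - 1.0


def mean_shift(t):
    """mu_t - mu_0 = E_t h_t(X) - E_0 h_0(X)."""
    return expect(lambda x: h(t, x), LAMBDA0 + t) - expect(lambda x: h(0.0, x), LAMBDA0)


def first_order(t):
    """E_0{h_t(X) - h_0(X)} + t * E_0 rho(X) h_0(X)."""
    drift = expect(lambda x: h(t, x) - h(0.0, x), LAMBDA0)
    score_term = t * expect(lambda x: rho(x) * h(0.0, x), LAMBDA0)
    return drift + score_term


def scaled_remainder(t):
    """The o(t) remainder of the lemma, divided by t."""
    return (mean_shift(t) - first_order(t)) / t


for t in (0.1, 0.01, 0.001):
    print(t, scaled_remainder(t))
```

The printed ratios are close to $2t$, in line with the $o(t)$ remainder claimed by the lemma.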

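Proof of Theorem 6.1. Let $F_n = F_{c/\sqrt{n}}$, $\xi_n = \sqrt{n}\{\hat S_n/n - S_n(F_n)/n\}$,
$\xi_n' = \sqrt{n}\{\hat S_n/n - A_n(\bar u_{F_n})\}$, $\xi_n'' = \sqrt{n}A_n(\bar u_{F_n} - u_{F_n})$ and $Z'' = Z(\bar u_{F_0} - u_{F_0})$.
Then $\xi_n = \xi_n' + \xi_n''$ and $\xi_n'$ depends on $\{X_j\}$ only. By (6.4), $u_{F_n}^2$ under $P_{F_n}$ are
uniformly integrable and $\mathcal{L}(u_{F_n}; F_n) \to \mathcal{L}(u_{F_0}; F_0)$ as $n \to \infty$. Thus, by the Lindeberg
central limit theorem and the weak law of large numbers,

(7.2) $E_{F_n}[\exp(it\xi_n'') \mid \{X_j\}] \to E_{F_0}\exp(itZ'')$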

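in probability for all $t$. Since $\xi_n'$ depends on $\{X_j\}$ only, this and (6.8) imply
$E_{F_n}\exp(it\xi_n')\,E\exp(itZ'') = E_{F_n}\exp(it\xi_n')\exp(it\xi_n'') + o(1) \to E_{F_0}\exp(it\xi_0)$.
Thus, since $E\exp(itZ'') \ne 0$ for all $t$,

(7.3) $\mathcal{L}\big(n^{-1/2}\{\hat S_n - \sum_{j=1}^n \bar u(X_j; F_{c/\sqrt{n}})\}; F_{c/\sqrt{n}}\big) = \mathcal{L}(\xi_n'; F_n) \to \mathcal{L}(\xi_0'; F_0)$

for a certain variable $\xi_0'$ independent of $c$ and the curve $\{F_t\} \in \mathcal{C}$.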

Define $\xi_{n,0}' = \sqrt{n}\{\hat S_n/n - A_n(\bar u_{F_0})\}$. By (6.3) and (6.5), $\xi_{n,0}' - \xi_n' = \sqrt{n}A_n(\bar u_{F_n} - \bar u_{F_0})
= \sqrt{n}E_{F_0}(\bar u_{F_n} - \bar u_{F_0}) + o_P(1) \to cE_{F_0}\phi(X)\rho(X)$ in probability under
$P_{F_0}$. Thus, as in [3], pages 24-26, by (7.3) and the LAN from (6.1) and (6.2),

(7.4) $E_{F_0}\exp(it\xi_0' + zZ(\rho)) = \exp[itzE_{F_0}\phi\rho + z^2E_{F_0}\rho^2/2]\,E_{F_0}\exp(it\xi_0')$

for all $\rho \in \mathcal{C}_*$ and complex $z$. Here $Z(h)$ are constructed so that $(\xi_{n,0}', Z_n(h))$
converges jointly in distribution to $(\xi_0', Z(h))$ for all $h \in L_2(F_0)$. Differentiating
(7.4) in $t$ at $t = 0$ and then in $z$ at $z = 0$, we find

(7.5) $E_{F_0}\xi_0'Z(h) = E_{F_0}\phi(X)h(X) = E_{F_0}Z(\phi_{*,0})Z(h)$
for all scores $h = \rho$, $\rho \in \mathcal{C}_*$, and then for all $h \in H_*$ by (6.2). Since $\phi_{*,0} \in H_*$,
$\xi_0' - Z(\phi_{*,0})$ and $Z(\phi_{*,0})$ are orthogonal in $L_2(F_0)$. This proves (i), since
$\xi_0'$ and $Z(\phi_{*,0})$ are both independent of $Z''$ by (7.2) and
$Z(\phi_{*,0}) + Z'' = Z(\phi_{*,0} + \bar u_{F_0} - u_{F_0})$.

Now, suppose $\mathcal{C}_*$ is convex in $L_2(F_0)$. By continuity extension, (7.4) holds
for all $\rho \in \bar{\mathcal{C}}_*$ and complex $z$. Let $\rho_j \in \mathcal{C}_*$. Since (7.4) holds for $\rho = s\rho_1 + (1 - s)\rho_2$,
$0 \le s \le 1$, with both sides being analytic in $s$, by analytic continuation
it holds for $\rho = s\rho_1 + (1 - s)\rho_2$ for all real $s$. Thus, (7.4) holds for all complex $z$
and

(7.6) $\rho \in H_0 = \{s\rho_1 + (1 - s)\rho_2 : \rho_j \in \mathcal{C}_*, -\infty < s < \infty\}$.

Let $H$ be the linear span of a set of finitely many members of $\mathcal{C}_*$. Let $\rho_1$
be a fixed interior point of $H \cap \mathcal{C}_*$, and $\rho_2 \in H$ with $\|\rho_2 - \rho_1\| = \delta_0$. For
sufficiently small $\delta_0 > 0$, $\rho_2 \in \mathcal{C}_*$ for all such $\rho_2$, so that $H \subset H_0$. Thus, $H_0$ is
a linear space and $H_*$ is the closure of $H_0$. It follows that (7.4) holds for all
$\rho \in H_*$ and complex $z$. As in [3], pages 25-26, this implies the independence
of $\xi_0' - Z(\phi_{*,0})$ and $\{Z(h) : h \in H_*\}$. Since $\{\xi_0', Z(h), h \in H_*\}$ is independent
of $Z'' = Z(\bar u_{F_0} - u_{F_0})$ by (7.2), the conclusions of part (ii) hold with the
remainder variable equal to $\xi_0' - Z(\phi_{*,0})$.

The proof of part (iii) follows easily from Lemma 7.2 with $h_t = \bar u_{F_t}$, which
gives

$\{\mu(F_t) - \mu(F_0)\}/t - E_{F_0}\{\bar u_{F_t} - \bar u_{F_0}\}/t \to E_{F_0}\bar u_{F_0}\rho = E_{F_0}\bar u_*\rho$.

It follows that (6.5) and (6.6) are equivalent under $E_{F_t}\bar u^2(X; F_t) = O(1)$,
with $\psi_* = \bar u_* + \phi_{*,0}$, by (1.6) and the definition of $\bar u_*$. The proof is
complete. □

Proof of Theorem 2.1. The proof is similar to that of Theorem 6.1(i),
so we omit certain details. By (2.2), $\xi_0$ is independent of $Z(\rho_\tau)$ under $P_\tau$.
Since $E_\tau u^2 < \infty$, (7.2) holds for fixed $F_n = F_\tau$, so that $\xi_0 = \xi_0' + Z(\bar u_\tau - u)$
as a sum of independent variables. Let $Z(h_\tau)$ be the projection of $\xi_0'$ to
$\{Z(h), h \in L_2(F_\tau)\}$ in $L_2(P_\tau)$ and $v_\tau = h_\tau + \bar u_\tau$. Then $\mathrm{Var}_\tau(\xi_0) \ge E_\tau(v_\tau - u)^2$
and $E_\tau(v_\tau - u)\rho_\tau = 0$. Since $\xi_0'$ is the limit of variables dependent on $\{X_j\}$
only, $h_\tau$ and $v_\tau$ depend on $X$ only.

Since -Er^^^9r,A(^) < E-j-^^u^ = 0(1), by (2.2) and Lemma 7.2 with ht = 
ho = u{x,'d), Pt+a - /^r ~ A^ErUpT = A''Et-iP^,t{X)pt-{X), where = 

p\l~^ErUPr- It follows that = Er{Vr — u)pr = Er{VrPT — i^^^rPr) = Er{Vr — 
il*^r)PT- Thus, Er{Vr — Ut) Pt = -E't(''/'*,t " U*,t)Pt with U^^r = Pt^t^^tUtPt- 

Since V'*,r — w'*,r is linear in p-r, Z{vr — Ur — (V'*,r — ^i*,T)) is independent 
of Z{jp^^r — tt^^r)- Thus, Vairivr — Ur) > Var^(V'*,T — w'*,t)) and Var^(Co) > 
VairiuT — lir) + Var^(tt-r — u)> VaiCT-i'P*,T — u) by (2.4). The proof is complete. 

□ 
Proofs of Theorems 2.2 and 6.2. Theorem 6.2(i) follows from Theorem 6.1
and Remark 6.2. Let $\mu(t;\tau) = E_\tau\bar u_t(X)$. By Lemma 7.2, $\mu'(\tau) = E_\tau u\rho_\tau$
in Theorem 2.2 and $\gamma_\tau = (\partial/\partial t)\mu(t;\tau)|_{t=\tau}$ in both theorems. Simple expansion
of (2.5) via (2.7) yields

$\hat S_n/n = A_n(\bar u_\tau) + \{\mu(\hat\tau_n;\tau) - \mu(\tau;\tau)\} + o_{P_\tau}(n^{-1/2})$
$\phantom{\hat S_n/n} = A_n(\bar u_\tau + \gamma_\tau\kappa_\tau) + o_{P_\tau}(n^{-1/2})$,

which implies (2.8). Note that $\gamma_\tau(\kappa_\tau - \kappa_{*,\tau})$ is orthogonal to $\bar u_\tau - u_\tau + \gamma_\tau\kappa_{*,\tau}$.
The proof is complete. □

Proofs of Theorems 2.3, 2.4 and 2.5. Let $G_t = (1 - t)G_0 + tG$,
$f_t = f_{G_t}$ and $E_t = E_{G_t}$, $t \ge 0$. By (2.9), (6.1) holds with $\rho = f_G/f_0 - 1$. Since
$E_t u^2 < \infty$, $u^2$ are uniformly integrable under $P_t$, so that (6.4) holds. Since
$f_0/f_t \le 1/(1 - t)$, $\{\bar u_t^2, 0 \le t \le 1/2\}$ are uniformly integrable under $E_0$, so
that (6.3) holds. Moreover,

(7.7) $E_0\{\bar u_t - \bar u_0\}/t = E_0\frac{f_G}{f_t}(\bar u_G - \bar u_0) \to E_0\frac{f_G}{f_0}(\bar u_G - \bar u_0)$.
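
For the mixture path $G_t = (1-t)G_0 + tG$, the conditional mean $\bar u_t(x) = E_{G_t}[u(X,\theta) \mid X = x]$ satisfies $\bar u_t - \bar u_0 = t(f_G/f_t)(\bar u_G - \bar u_0)$, because $f_t\bar u_t = (1-t)f_0\bar u_0 + tf_G\bar u_G$ and $f_t = (1-t)f_0 + tf_G$. This is the identity behind (7.7). The sketch below checks it numerically in a Poisson mixture with the motorist utility $u(x,\vartheta) = \vartheta I\{x = a\}$ from the Introduction. The two discrete mixing distributions, the value of $a$ and all function names are hypothetical choices made for illustration.

```python
import math

A = 1  # hypothetical prespecified count a in u(x, theta) = theta * 1{x = a}
G0 = [(0.5, 0.6), (2.0, 0.4)]  # hypothetical mixing distribution G_0: (atom, weight)
G = [(1.0, 0.3), (4.0, 0.7)]   # hypothetical mixing distribution G


def poisson_pmf(x, lam):
    """Poisson(lam) probability of the count x, computed on the log scale."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def mixture(t):
    """Atoms and weights of G_t = (1 - t) * G_0 + t * G."""
    return [(th, (1.0 - t) * w) for th, w in G0] + [(th, t * w) for th, w in G]


def marginal(x, mixing):
    """Marginal probability f(x) of X when theta has the given mixing distribution."""
    return sum(w * poisson_pmf(x, th) for th, w in mixing)


def u_bar(x, mixing):
    """Conditional mean E[theta * 1{X = a} | X = x] under the mixing distribution."""
    if x != A:
        return 0.0
    return sum(w * th * poisson_pmf(x, th) for th, w in mixing) / marginal(x, mixing)


def lhs(t, x_max=80):
    """E_0{u_bar_t - u_bar_0} / t, with the sum over x truncated at x_max."""
    g_t = mixture(t)
    return sum(marginal(x, G0) * (u_bar(x, g_t) - u_bar(x, G0))
               for x in range(x_max + 1)) / t


def rhs(t, x_max=80):
    """E_0 (f_G / f_t)(u_bar_G - u_bar_0); at t = 0 this is the limit in (7.7)."""
    g_t = mixture(t)
    return sum(marginal(x, G0) * marginal(x, G) / marginal(x, g_t)
               * (u_bar(x, G) - u_bar(x, G0)) for x in range(x_max + 1))


for t in (0.5, 0.1, 0.001):
    print(t, lhs(t), rhs(t), rhs(0.0))
```

The two sides agree for every $t$ in $(0, 1)$ up to rounding, and both approach `rhs(0.0)` as $t$ decreases.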

Suppose there exists a regular estimator of (1.1). Let $\xi_0'$ be as in (7.5) and
let $Z(v - \bar u_0)$ be the projection of $\xi_0'$ to $\{Z(h), h \in L_2(f_0)\}$ as in the proof of
Theorem 2.1. It follows from (7.7) and the argument leading to (7.5) that

$E_0(v - \bar u_0)(f_G/f_0 - 1) = E_0Z(v - \bar u_0)Z(\rho) = E_0\frac{f_G}{f_0}(\bar u_G - \bar u_0)$,

which implies $E_Gv - E_0v + E_0u = E_Gu$. Since $v$ does not depend on the
choice of $G \in \mathcal{G}_{G_0}$, $v \in \mathcal{V}_{G_0}$. By the Lindeberg central limit theorem, $E_{G_0}v^2 < \infty$
and $v \in \mathcal{V}_{G_0}$ imply $\mathcal{L}(Z_n(v - u); P_{G_{c/\sqrt{n}}}) \to \mathcal{L}(Z(v - u); P_0)$, so that $V_n$ in
(2.15) is regular at $G_0$ for all $v \in \mathcal{V}_{G_0}$. If $v$ is a limit point of $\mathcal{V}_{G_0}$ in $L_2(f_0)$,
$V_n$ is also a regular estimator of $S_n$ at $P_0$, so that $\mathcal{V}_{G_0}$ is closed in $L_2(f_0)$.
This completes the proof of Theorem 2.3.

The proof of Theorem 2.4 is similar to those of Theorems 2.2 and 6.2
but simpler. We note that $E_{G_0}(v_G - v_{G_0}) = 0$. Finally, Theorem 2.5 follows
from the fact that $\mathcal{V}_G$ contains a single function $v$ due to the completeness
of exponential families. The proofs are complete. □